Automatic Analysis Of Descriptive Texts
Abstract
This paper describes a system that attempts to interpret descriptive texts without the use of complex grammars. The purpose of the system is to transform the descriptions to a standard form which may be used as the basis of a database system knowledgeable in the subject matter of the text. The texts currently used are wild plant descriptions taken directly from a popular book on the subject. Properties such as size, shape and colour are abstracted from the descriptions and related to parts of the plant in which we are interested. The resulting output is a standardised hierarchical structure holding only significant features of the description. The system, implemented in the PROLOG programming language, uses keywords to identify the way segments of the text relate to the object described. Information on words is held in a keyword list of nouns relating to parts of the object described. A dictionary contains the attributes of ordinary words used by the system to analyse the text. The text is divided into segments using information provided by conjunctions and punctuation. About half the texts processed are correctly analysed at present. Proposals are made for future work to improve this figure. There seems to be no inherent reason why the technique cannot be generalised so that any text of semi-standard descriptions can be automatically converted to a canonical form.
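The keyword-and-dictionary approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' PROLOG implementation: it is written in Python, and the word lists (`PART_KEYWORDS`, `DICTIONARY`, `CONJUNCTIONS`), the function names, and the rule that a segment without a keyword inherits the most recently named plant part are all assumptions made for illustration.

```python
import re

# Hypothetical keyword list: nouns naming plant parts (assumed, not from the paper).
PART_KEYWORDS = {
    "leaf": "leaf", "leaves": "leaf",
    "flower": "flower", "flowers": "flower",
    "stem": "stem", "stems": "stem",
}

# Hypothetical dictionary: ordinary words mapped to the property they express.
DICTIONARY = {
    "yellow": "colour", "white": "colour", "green": "colour",
    "oval": "shape", "round": "shape", "narrow": "shape",
    "small": "size", "large": "size", "tall": "size",
}

# Hypothetical set of conjunctions treated as segment boundaries.
CONJUNCTIONS = {"and", "but", "with"}


def split_segments(text):
    """Divide text into segments (word lists) at punctuation and conjunctions."""
    segments = []
    for chunk in re.split(r"[.,;:]", text.lower()):
        current = []
        for word in chunk.split():
            if word in CONJUNCTIONS:
                if current:
                    segments.append(current)
                current = []
            else:
                current.append(word)
        if current:
            segments.append(current)
    return segments


def analyse(text):
    """Build a part -> property -> values hierarchy from a description."""
    result = {}
    current_part = None
    for segment in split_segments(text):
        # A keyword in the segment names the part being described.
        for word in segment:
            if word in PART_KEYWORDS:
                current_part = PART_KEYWORDS[word]
                break
        # Assumption: a segment with no keyword refers to the last part named.
        if current_part is None:
            continue
        for word in segment:
            prop = DICTIONARY.get(word)
            if prop is not None:
                result.setdefault(current_part, {}).setdefault(prop, []).append(word)
    return result


if __name__ == "__main__":
    print(analyse("Leaves oval and green. Flowers small, yellow."))
```

Words found in neither list are ignored, so only features the dictionary recognises reach the output hierarchy, loosely mirroring the abstract's "only significant features".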